Post-Processing¶
Storage¶
- Ensure that data is backed up to prevent data loss during post-processing.
- Protect against accidental writes / deletions:
- Linux: chmod -R a-w <data directory>
- Store backups in a safe place, e.g. do not travel with backup disks.
- Control access to data
- Protect against unauthorized access:
- Linux: chmod -R o-rwx <data directory>
- Do not work directly on the raw data or, even worse, on the backup. Keep the corpus separate from derived data/features.
- Separate directories also make it easy to restart automated processing from scratch after errors.
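The two chmod calls above can also be applied from a processing script. The following is a minimal sketch, assuming a Linux file system; the function name strip_mode_bits and the constants are illustrative and not part of any existing tool.

```python
import os
import stat
from pathlib import Path

# Permission bits cleared by `chmod a-w` and `chmod o-rwx` respectively.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
OTHER_BITS = stat.S_IRWXO


def strip_mode_bits(root, bits):
    """Clear the given permission bits on root and everything below it."""
    root = Path(root)
    # Walk bottom-up so a directory is changed after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # do not follow links that point outside the corpus
            path.chmod(stat.S_IMODE(path.stat().st_mode) & ~bits)
    root.chmod(stat.S_IMODE(root.stat().st_mode) & ~bits)
```

Calling strip_mode_bits(corpus, WRITE_BITS) corresponds to chmod -R a-w, and strip_mode_bits(corpus, OTHER_BITS) to chmod -R o-rwx. Neither replaces a real backup.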
Procedure¶
Always: Automate as much as possible.
Data Extraction¶
- Extract trials which fulfill quality criteria:
- Recording complete?
- Use a validation script.
- Ignore trials/recordings where issues occurred during the recording session (as indicated in the recording log).
- Extract regions of interest from videos etc.
- Conversion to target format:
- Decide on target formats (video, audio, CSV, etc.) depending on usage patterns (H.264 + AAC plays on Windows, macOS and Linux).
- Generate multiple resolutions/sizes if this speeds up annotation or processing.
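The extraction steps above could be scripted roughly as follows. This is a sketch under assumptions: the one-directory-per-trial layout, the file names in REQUIRED_FILES and the set of trials flagged in the recording log are all hypothetical. The ffmpeg options used (-i, -vf scale, -c:v libx264, -c:a aac) are standard, but the command is only built here, not executed.

```python
from pathlib import Path

# Hypothetical layout: one directory per trial, each holding these files.
REQUIRED_FILES = ("video.mp4", "audio.wav", "events.csv")


def is_complete(trial_dir, required=REQUIRED_FILES):
    """A trial counts as complete if every required file exists and is non-empty."""
    trial_dir = Path(trial_dir)
    return all(
        (trial_dir / name).is_file() and (trial_dir / name).stat().st_size > 0
        for name in required
    )


def select_trials(corpus_dir, flagged, required=REQUIRED_FILES):
    """Return complete trials that were not flagged in the recording log."""
    trials = sorted(p for p in Path(corpus_dir).iterdir() if p.is_dir())
    return [t for t in trials if t.name not in flagged and is_complete(t, required)]


def transcode_command(source, target, height):
    """Build an ffmpeg call producing H.264 video and AAC audio at a given height."""
    return [
        "ffmpeg", "-i", str(source),
        "-vf", f"scale=-2:{height}",  # keep the aspect ratio, force an even width
        "-c:v", "libx264", "-c:a", "aac",
        str(target),
    ]
```

The returned list can be passed to subprocess.run once per target resolution, with the output written to the derived-data directory rather than next to the raw recordings.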
Synchronization¶
- Think about the required accuracy of synchronization.
- Think about how to validate the accuracy.
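One possible way to validate accuracy is to compare the timestamps of events that are visible in both streams after synchronization (for example clapperboard hits) against a required tolerance. The sketch below assumes the events have already been matched one-to-one; all names are illustrative.

```python
from statistics import mean


def residual_offsets(reference_times, other_times):
    """Pairwise differences (seconds) between matched events in two streams."""
    if len(reference_times) != len(other_times):
        raise ValueError("event lists must be matched one-to-one")
    return [o - r for r, o in zip(reference_times, other_times)]


def sync_report(reference_times, other_times, tolerance):
    """Summarize the residual error and check it against the required accuracy."""
    residuals = residual_offsets(reference_times, other_times)
    worst = max(abs(r) for r in residuals)
    return {
        "mean_offset": mean(residuals),  # systematic shift still present
        "max_abs_error": worst,          # worst single event
        "within_tolerance": worst <= tolerance,
    }
```

A natural tolerance for video annotation is one frame duration, e.g. 0.04 s at 25 frames per second. A mean offset far from zero points to a constant shift, while residuals that grow over time point to clock drift.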
Annotation¶
- Define and document an annotation scheme based on the literature and related to the hypotheses.
- Think about how to prove annotation reliability (inter-rater agreement etc.).
- Have multiple people annotate the data in parallel and in isolation from each other.
- Train annotators.
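Cohen's kappa is one common measure of inter-rater agreement for two annotators assigning categorical labels. It is shown here as an example and is not necessarily the measure best suited to a given annotation scheme. A minimal sketch, assuming both annotators labeled the same items in the same order:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)
```

A value of 1 means perfect agreement and 0 means agreement at chance level. For more than two annotators or for ordinal and interval data, other measures such as Fleiss' kappa or Krippendorff's alpha apply.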
Technical Solutions¶
Have a look at the Dataset Processing Project for some useful scripts.
Synchronization¶
- Automate audio and video analyses (using existing tools)
- Blackframe detection to synchronize videos containing black frames (e.g. ffmpeg)
- Detection of known sound patterns (e.g. clapperboard, robot utterance, via Praat and cross-correlation with a reference audio signal)
- Estimation of temporal offsets between cameras (cross-correlation)
- Clapperboard detection using Vicon markers (distance of two markers)
- Align videos to system logs, e.g. by recording one audio channel via the middleware, or use more sophisticated solutions
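The cross-correlation idea mentioned above can be illustrated with a brute-force version in plain Python. This is a sketch for short signals only; for real audio an FFT-based implementation from a numerical library is needed for speed. The function name and the max_lag parameter are illustrative.

```python
def estimate_lag(reference, signal, max_lag):
    """Return the lag (in samples) at which signal best matches reference.

    A positive result means the pattern occurs later in signal than in reference.
    """
    def centered(values):
        m = sum(values) / len(values)
        return [v - m for v in values]

    ref, sig = centered(reference), centered(signal)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate ref[i] with sig[i + lag] over the overlapping region.
        score = sum(
            ref[i] * sig[i + lag]
            for i in range(len(ref))
            if 0 <= i + lag < len(sig)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Dividing the returned lag by the sample rate gives the temporal offset in seconds between the two recordings.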
Audio Processing¶
- sox
Automation/Orchestration¶
- Scripting languages (bash, Python)
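As an illustration of orchestration in Python, the sketch below rebuilds a derived-data directory from scratch on every run, which matches the advice to keep the corpus separate from derived data. The function name run_pipeline and the step signature are assumptions made for this example.

```python
import shutil
from pathlib import Path


def run_pipeline(corpus_dir, derived_dir, steps):
    """Rebuild derived_dir from scratch by running each step in order.

    Each step is a callable taking (corpus_dir, derived_dir). The corpus is
    only meant to be read; all output goes below derived_dir.
    """
    corpus = Path(corpus_dir).resolve()
    derived = Path(derived_dir).resolve()
    # Refuse to wipe a directory that is, or contains, the corpus.
    if derived == corpus or derived in corpus.parents:
        raise ValueError("derived data must not overlap the corpus directory")
    # Starting from an empty directory keeps stale results out of a rerun.
    shutil.rmtree(derived, ignore_errors=True)
    derived.mkdir(parents=True)
    for step in steps:
        step(corpus, derived)
```

Steps would typically wrap calls to external tools such as ffmpeg or sox via subprocess.run with check=True, so that a failing tool aborts the run.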